The original preregistrations for the studies contained both hypotheses and the specific analytic strategies that would be used to test them. However, these preregistrations did not include a meta-analytic strategy. Separately, a number of research questions/hypotheses were generated from exploration of the data from Experiments 1-6 that were not contained in the original preregistration, or where the specific analytic strategy to test them was poorly specified or more difficult to interpret. Separately, some methodological improvements were thought of after Experiments 1-6 was run (e.g., improved exclusion criteria to ensure participants stayed on the page where they watched/listened to the intervention in its entirety). We therefore elected to use the data from Experiments 1-6 to create this (non-preregistered) alternative analytic strategy that formalized our core research questions, hypotheses, analytic models, inference rules, and other researcher degrees of freedom. This analytic strategy (and code to implement it) will be preregistered for Experiment 7 which will provide strong confirmatory tests of these hypotheses.
# dependencies
library(tidyverse)
library(knitr)
library(kableExtra)
library(brms)
library(parallel)
library(tidybayes)
library(bayestestR)
library(sjPlot)
library(psych)
library(rsample)
library(broom)
library(purrr)
library(IATscores)
library(lavaan)
library(semTools)
library(modelr)
library(furrr)
# set up parallel processing
future::plan(multiprocess)
# knitr options
options(knitr.kable.NA = "/")
# set seed for bootstrapping reproducibility
set.seed(42)
# create necessary folder
dir.create("models")All dependent variables (self-reported evaluations and IAT D2 scores) were standardized (by 1 SD) after exclusions and prior to analysis condition (see Lorah, 2018: https://doi.org/10.1186/s40536-018-0061-2). This was done within each level of both IV (i.e., by Source Valence condition [positive vs. negative], and by Video Content [Genuine vs. Deepfaked]). As such, the beta estimates obtained from the Bayesian models (see research questions and data analysis plans below) therefore represent standardized beta values (\(\beta\) rather than \(B\)). More importantly, the nature of this standardization makes these estimates somewhat comparable to the frequentist standardized effect size metric Cohen’s \(d\), as both are a differences in (estimated) means as a proportion of SD although they should not be treated as equivalent. Effect size magnitude here can therefore be thought of along comparable scales as Cohen’s \(d\). As such, to aid interpretability, the point estimates of effect size will be reported as \(\delta\) (delta).
# full data
data_processed <- read.csv("../data/processed/4_data_participant_level_with_hand_scoring.csv") %>%
# include only experiments 1-6
filter(experiment %in% c(1, 2, 3, 4, 5, 6)) %>%
# set factor levels for t test comparisons
mutate(source_valence = fct_relevel(source_valence,
"negative",
"positive"),
experiment_condition = fct_relevel(experiment_condition,
"genuine",
"deepfaked"),
experiment = as.factor(experiment))
# apply exclusions
data_after_exclusions <- data_processed %>%
filter(exclude_subject == FALSE &
exclude_implausible_intervention_linger == FALSE) %>%
# standardize DVs by 1SD within each experiment and their conditions
group_by(experiment, experiment_condition, source_valence) %>%
mutate(mean_self_reported_evaluation = mean_self_reported_evaluation/sd(mean_self_reported_evaluation),
IAT_D2 = IAT_D2/sd(IAT_D2),
mean_intentions = mean_intentions/sd(mean_intentions)) %>%
ungroup()
# item level for iat
data_iat_item_level_after_exclusions <- read_csv("../data/processed/2.4_data_iat_item_level.csv") %>%
# exclude the same participants as above
semi_join(rename(data_after_exclusions, subject_original = subject), by = "subject_original") ggplot(data_after_exclusions, aes(mean_self_reported_evaluation, color = experiment)) +
geom_density() +
facet_wrap( ~ experiment_condition + source_valence) +
ggtitle("Standardized scores")ggplot(data_after_exclusions, aes(IAT_D2, color = experiment)) +
geom_density() +
facet_wrap( ~ experiment_condition + source_valence) +
ggtitle("Standardized scores")ggplot(data_after_exclusions, aes(mean_intentions, color = experiment)) +
geom_density() +
facet_wrap( ~ experiment_condition + source_valence) +
ggtitle("Standardized scores")data_processed %>%
group_by(experiment) %>%
summarise(n = n(),
excluded_n = sum(exclude_subject > 0 | exclude_implausible_intervention_linger > 0),
excluded_percent = (excluded_n / n) *100) %>%
mutate_if(is.numeric, round, digits = 1) %>%
kable(align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| experiment | n | excluded_n | excluded_percent |
|---|---|---|---|
| 1 | 165 | 24 | 14.5 |
| 2 | 167 | 36 | 21.6 |
| 3 | 428 | 91 | 21.3 |
| 4 | 429 | 106 | 24.7 |
| 5 | 276 | 66 | 23.9 |
| 6 | 265 | 61 | 23.0 |
data_after_exclusions %>%
group_by(experiment) %>%
summarise(n = n(),
age_mean = mean(age, na.rm = TRUE),
age_sd = sd(age, na.rm = TRUE)) %>%
mutate_if(is.numeric, round, digits = 1) %>%
kable(align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| experiment | n | age_mean | age_sd |
|---|---|---|---|
| 1 | 141 | 29.7 | 7.6 |
| 2 | 131 | 31.1 | 7.3 |
| 3 | 337 | 29.8 | 8.7 |
| 4 | 323 | 30.1 | 9.0 |
| 5 | 210 | 31.2 | 11.5 |
| 6 | 204 | 33.3 | 12.5 |
data_after_exclusions %>%
count(experiment, gender) %>%
spread(gender, n) %>%
kable(knitr.kable.NA = "/", align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| experiment | female | male | Non-binary | other | Prefer not to disclose |
|---|---|---|---|---|---|
| 1 | 67 | 73 | / | 1 | / |
| 2 | 76 | 55 | / | / | / |
| 3 | 184 | 149 | / | 4 | / |
| 4 | 189 | 132 | / | 2 | / |
| 5 | 119 | 88 | 2 | / | 1 |
| 6 | 120 | 82 | 2 | / | / |
model_sr <- "scale =~ ratings_bad_good + ratings_dislike_like + ratings_negative_positive"
fit_cfa_sr <- data_after_exclusions %>%
cfa(model = model_sr, data = .)
results_reliability_sr <- fit_cfa_sr %>%
reliability() %>%
as.data.frame() %>%
rownames_to_column(var = "metric") %>%
select(metric, estimate = scale) %>%
filter(metric %in% c("alpha",
"omega2")) %>%
mutate(metric = recode(metric,
"alpha" = "alpha",
"omega2" = "omega_t"),
estimate = round(estimate, 3))
results_reliability_sr %>%
kable(knitr.kable.NA = "/", align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| metric | estimate |
|---|---|
| alpha | 0.987 |
| omega_t | 0.987 |
split half
results_iat_split_half_reliability <- data_iat_item_level_after_exclusions %>%
SplitHalf.D2(IATdata = .) %>%
mutate(algorithm = ifelse(algorithm == "p2112", "D2", algorithm),
splithalf = round(splithalf, 3))## [1] "2020-12-02 01:48:20: Applying parameter P4 = dist"
## [1] "2020-12-02 01:48:20: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:20: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:20: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:21: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:21: IAT scores have been computed"
## [1] "2020-12-02 01:48:21: Applying parameter P4 = dist"
## [1] "2020-12-02 01:48:21: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:21: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:22: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:22: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:22: IAT scores have been computed"
results_iat_split_half_reliability %>%
kable(knitr.kable.NA = "/", align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| algorithm | splithalf |
|---|---|
| D2 | 0.839 |
model_bi <- "scale =~ behavioral_intentions_share + behavioral_intentions_subscribe + behavioral_intentions_recommend"
fit_cfa_bi <- data_after_exclusions %>%
cfa(model = model_bi, data = .)
results_reliability_bi <- fit_cfa_bi %>%
reliability() %>%
as.data.frame() %>%
rownames_to_column(var = "metric") %>%
select(metric, estimate = scale) %>%
filter(metric %in% c("alpha",
"omega2")) %>%
mutate(metric = recode(metric,
"alpha" = "alpha",
"omega2" = "omega_t"),
estimate = round(estimate, 3))
results_reliability_bi %>%
kable(knitr.kable.NA = "/", align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| metric | estimate |
|---|---|
| alpha | 0.941 |
| omega_t | 0.941 |
data_after_exclusions %>%
select(source_valence,
experiment_condition) %>%
drop_na() %>%
count(experiment_condition,
source_valence) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| experiment_condition | source_valence | n |
|---|---|---|
| genuine | negative | 382 |
| genuine | positive | 425 |
| deepfaked | negative | 257 |
| deepfaked | positive | 282 |
fit_exploratory_selfreport <-
brm(formula = mean_self_reported_evaluation ~ source_valence * experiment_condition + (1 | experiment),
family = gaussian(),
data = data_after_exclusions,
file = "models/fit_exploratory_selfreport",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_self_reported_evaluation ~ source_valence * experiment_condition + (1 | experiment)
## Data: data_after_exclusions (Number of observations: 1346)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.18 0.11 0.06 0.46 1.00 6283 9291
##
## Population-Level Effects:
## Estimate Est.Error
## Intercept -1.48 0.10
## source_valencepositive 2.71 0.07
## experiment_conditiondeepfaked 0.08 0.09
## source_valencepositive:experiment_conditiondeepfaked 0.01 0.11
## l-95% CI u-95% CI Rhat
## Intercept -1.68 -1.28 1.00
## source_valencepositive 2.56 2.85 1.00
## experiment_conditiondeepfaked -0.09 0.25 1.00
## source_valencepositive:experiment_conditiondeepfaked -0.22 0.23 1.00
## Bulk_ESS Tail_ESS
## Intercept 7616 9466
## source_valencepositive 15458 18050
## experiment_conditiondeepfaked 14302 17335
## source_valencepositive:experiment_conditiondeepfaked 13679 17151
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.03 0.02 1.00 1.07 1.00 22400 18545
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_selfreport) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
| b_experiment_conditiondeepfaked | uninformative |
| b_source_valencepositive.experiment_conditiondeepfaked | uninformative |
# plot_model(fit_exploratory_selfreport)
plot_model(fit_exploratory_selfreport, type = "pred", terms = c("source_valence", "experiment_condition"))# percent moderation
draws_sr <-
bind_cols(
select(spread_draws(fit_exploratory_selfreport, b_source_valencepositive), b_source_valencepositive),
select(spread_draws(fit_exploratory_selfreport, b_experiment_conditiondeepfaked), b_experiment_conditiondeepfaked),
select(spread_draws(fit_exploratory_selfreport, `b_source_valencepositive:experiment_conditiondeepfaked`), `b_source_valencepositive:experiment_conditiondeepfaked`)
) %>%
rename(main_valence = b_source_valencepositive,
main_experiment_condition = b_experiment_conditiondeepfaked,
interaction = `b_source_valencepositive:experiment_conditiondeepfaked`) %>%
mutate(effect_genuine = main_valence,
effect_deepfaked = main_valence + main_experiment_condition + interaction,
#percent_moderation = (main_experiment_condition + interaction)/main_valence *100, # alt method, same result
percent_comparison = (effect_deepfaked/effect_genuine)*100)
# results
estimates_sr <-
map_estimate(draws_sr) %>%
full_join(bayestestR::hdi(draws_sr, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_sr, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_sr %>%
select(-percent_comparison) %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
# results table
estimates_sr %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| main_valence | 2.71 | 2.56 | 2.85 | 2.58 | 2.82 | 0.0000000 |
| main_experiment_condition | 0.09 | -0.09 | 0.25 | -0.06 | 0.23 | 0.1794643 |
| interaction | 0.01 | -0.21 | 0.24 | -0.18 | 0.20 | 0.4717143 |
| effect_genuine | 2.71 | 2.56 | 2.85 | 2.58 | 2.82 | 0.0000000 |
| effect_deepfaked | 2.80 | 2.63 | 2.96 | 2.66 | 2.93 | 0.0000000 |
| percent_comparison | 102.84 | 97.19 | 109.21 | 98.39 | 108.42 | / |
# hypothesis testing
H1a <- ifelse((estimates_sr %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)) > 0,
"Accepted", "Rejected")
H1b <- ifelse((estimates_sr %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_95_lower)) > 0,
"Accepted", "Rejected")
H2a <- ifelse((estimates_sr %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_90_lower)) >
(estimates_sr %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)),
"Accepted", "Rejected")
comparison_string_sr <-
paste0("Deepfakes are ",
estimates_sr %>% filter(Parameter == "percent_comparison") %>% pull(MAP_Estimate) %>% round(1),
"% (95% CI [",
estimates_sr %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_lower) %>% round(1),
", ",
estimates_sr %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_upper) %>% round(1),
"]) as effective as genuine content in establishing self-reported evaluations")H1a
The content of the genuine videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 1) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
H1b
The content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 1) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
H2a
Change in self-reported evaluations (i.e., between Source Valence conditions) induced by Deepfaked video content will be non-inferior to genuine content.
fit_exploratory_implicit <-
brm(formula = IAT_D2 ~ source_valence * experiment_condition + (1 | experiment),
family = gaussian(),
data = data_after_exclusions,
file = "models/fit_exploratory_implicit",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: IAT_D2 ~ source_valence * experiment_condition + (1 | experiment)
## Data: data_after_exclusions (Number of observations: 1346)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.20 0.12 0.07 0.50 1.00 5922 9310
##
## Population-Level Effects:
## Estimate Est.Error
## Intercept 0.05 0.11
## source_valencepositive 1.32 0.07
## experiment_conditiondeepfaked 0.07 0.08
## source_valencepositive:experiment_conditiondeepfaked -0.00 0.11
## l-95% CI u-95% CI Rhat
## Intercept -0.17 0.26 1.00
## source_valencepositive 1.19 1.46 1.00
## experiment_conditiondeepfaked -0.10 0.23 1.00
## source_valencepositive:experiment_conditiondeepfaked -0.22 0.22 1.00
## Bulk_ESS Tail_ESS
## Intercept 7697 8823
## source_valencepositive 16144 18114
## experiment_conditiondeepfaked 15002 18404
## source_valencepositive:experiment_conditiondeepfaked 14394 17046
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.00 0.02 0.97 1.04 1.00 21725 18539
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_implicit) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
| b_experiment_conditiondeepfaked | uninformative |
| b_source_valencepositive.experiment_conditiondeepfaked | uninformative |
#plot_model(fit_exploratory_implicit)
plot_model(fit_exploratory_implicit, type = "pred", terms = c("source_valence", "experiment_condition"))# percent moderation
draws_imp <-
bind_cols(
select(spread_draws(fit_exploratory_implicit, b_source_valencepositive), b_source_valencepositive),
select(spread_draws(fit_exploratory_implicit, b_experiment_conditiondeepfaked), b_experiment_conditiondeepfaked),
select(spread_draws(fit_exploratory_implicit, `b_source_valencepositive:experiment_conditiondeepfaked`), `b_source_valencepositive:experiment_conditiondeepfaked`)
) %>%
rename(main_valence = b_source_valencepositive,
main_experiment_condition = b_experiment_conditiondeepfaked,
interaction = `b_source_valencepositive:experiment_conditiondeepfaked`) %>%
mutate(effect_genuine = main_valence,
effect_deepfaked = main_valence + main_experiment_condition + interaction,
#percent_moderation = (main_experiment_condition + interaction)/main_valence *100, # alt method, same result
percent_comparison = (effect_deepfaked/effect_genuine)*100)
# results table
estimates_imp <-
map_estimate(draws_imp) %>%
full_join(bayestestR::hdi(draws_imp, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_imp, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_imp %>%
select(-percent_comparison) %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
estimates_imp %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| main_valence | 1.33 | 1.18 | 1.46 | 1.21 | 1.44 | 0.0000000 |
| main_experiment_condition | 0.07 | -0.10 | 0.23 | -0.07 | 0.21 | 0.1996786 |
| interaction | 0.00 | -0.23 | 0.21 | -0.18 | 0.19 | 0.4972857 |
| effect_genuine | 1.33 | 1.18 | 1.46 | 1.21 | 1.44 | 0.0000000 |
| effect_deepfaked | 1.41 | 1.23 | 1.55 | 1.26 | 1.53 | 0.0000000 |
| percent_comparison | 104.46 | 93.73 | 117.95 | 95.34 | 115.66 | / |
# hypothesis testing
H1c <- ifelse((estimates_imp %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)) > 0,
"Accepted", "Rejected")
H1d <- ifelse((estimates_imp %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_95_lower)) > 0,
"Accepted", "Rejected")
H2b <- ifelse((estimates_imp %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_90_lower)) >
(estimates_imp %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)),
"Accepted", "Rejected")
comparison_string_imp <-
paste0("Deepfakes are ",
estimates_imp %>% filter(Parameter == "percent_comparison") %>% pull(MAP_Estimate) %>% round(1),
"% (95% CI [",
estimates_imp %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_lower) %>% round(1),
", ",
estimates_imp %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_upper) %>% round(1),
"]) as effective as genuine content in establishing self-reported evaluations")H1c
The content of the genuine videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
H1d
The content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
H2b
Change in IAT D2 scores (i.e., between Source Valence conditions) induced by Deepfaked video content will be non-inferior to genuine content.
fit_exploratory_intentions <-
brm(formula = mean_intentions ~ source_valence * experiment_condition, # no random effect for experiment as only exp 6 assessed intentions
family = gaussian(),
data = data_after_exclusions,
file = "models/fit_exploratory_intentions",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_intentions ~ source_valence * experiment_condition
## Data: data_after_exclusions (Number of observations: 204)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Population-Level Effects:
## Estimate Est.Error
## Intercept -1.64 0.14
## source_valencepositive 1.12 0.20
## experiment_conditiondeepfaked -1.70 0.21
## source_valencepositive:experiment_conditiondeepfaked 1.95 0.28
## l-95% CI u-95% CI Rhat
## Intercept -1.92 -1.36 1.00
## source_valencepositive 0.72 1.52 1.00
## experiment_conditiondeepfaked -2.10 -1.29 1.00
## source_valencepositive:experiment_conditiondeepfaked 1.38 2.50 1.00
## Bulk_ESS Tail_ESS
## Intercept 13786 17130
## source_valencepositive 12197 15327
## experiment_conditiondeepfaked 11750 16015
## source_valencepositive:experiment_conditiondeepfaked 10334 14695
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.01 0.05 0.91 1.11 1.00 18031 17126
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_intentions) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
| b_experiment_conditiondeepfaked | uninformative |
| b_source_valencepositive.experiment_conditiondeepfaked | uninformative |
#plot_model(fit_exploratory_intentions)
plot_model(fit_exploratory_intentions, type = "pred", terms = c("source_valence", "experiment_condition"))# percent moderation
draws_intentions <-
bind_cols(
select(spread_draws(fit_exploratory_intentions, b_source_valencepositive), b_source_valencepositive),
select(spread_draws(fit_exploratory_intentions, b_experiment_conditiondeepfaked), b_experiment_conditiondeepfaked),
select(spread_draws(fit_exploratory_intentions, `b_source_valencepositive:experiment_conditiondeepfaked`), `b_source_valencepositive:experiment_conditiondeepfaked`)
) %>%
rename(main_valence = b_source_valencepositive,
main_experiment_condition = b_experiment_conditiondeepfaked,
interaction = `b_source_valencepositive:experiment_conditiondeepfaked`) %>%
mutate(effect_genuine = main_valence,
effect_deepfaked = main_valence + main_experiment_condition + interaction,
#percent_moderation = (main_experiment_condition + interaction)/main_valence *100, # alt method, same result
percent_comparison = (effect_deepfaked/effect_genuine)*100)
# results table
estimates_intentions <-
map_estimate(draws_intentions) %>%
full_join(bayestestR::hdi(draws_intentions, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_intentions, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_intentions %>%
select(-percent_comparison) %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
estimates_intentions %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| main_valence | 1.11 | 0.73 | 1.53 | 0.79 | 1.46 | 0 |
| main_experiment_condition | -1.70 | -2.09 | -1.28 | -2.04 | -1.36 | 0 |
| interaction | 1.95 | 1.39 | 2.50 | 1.48 | 2.41 | 0 |
| effect_genuine | 1.11 | 0.73 | 1.53 | 0.79 | 1.46 | 0 |
| effect_deepfaked | 1.37 | 0.99 | 1.76 | 1.05 | 1.69 | 0 |
| percent_comparison | 118.40 | 85.90 | 168.87 | 91.34 | 158.40 | / |
# hypothesis testing
H1e <- ifelse((estimates_intentions %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)) > 0,
"Accepted", "Rejected")
H1f <- ifelse((estimates_intentions %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_95_lower)) > 0,
"Accepted", "Rejected")
H2c <- ifelse((estimates_intentions %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_90_lower)) >
(estimates_intentions %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)),
"Accepted", "Rejected")
comparison_string_intentions <-
paste0("Deepfakes are ",
estimates_intentions %>% filter(Parameter == "percent_comparison") %>% pull(MAP_Estimate) %>% round(1),
"% (95% CI [",
estimates_intentions %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_lower) %>% round(1),
", ",
estimates_intentions %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_upper) %>% round(1),
"]) as effective as genuine content in establishing self-reported evaluations")H1e
The content of the genuine videos (i.e., Source Valence) will influence participants’ behavioral intention responses. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
H1f
The content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention responses. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
H2c
Change in behavioral intentions (i.e., between Source Valence conditions) induced by Deepfaked video content will be non-inferior to genuine content.
data_after_exclusions %>%
count(deepfake_detection_open_recoded,
deepfake_detection_rater_1,
deepfake_detection_rater_2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| deepfake_detection_open_recoded | deepfake_detection_rater_1 | deepfake_detection_rater_2 | n |
|---|---|---|---|
| FALSE | FALSE | FALSE | 414 |
| FALSE | FALSE | TRUE | 16 |
| FALSE | TRUE | FALSE | 31 |
| TRUE | TRUE | TRUE | 115 |
| / | / | / | 770 |
data_after_exclusions %>%
summarize(percent_agreement = round(mean(deepfake_detection_rater_1 == deepfake_detection_rater_2, na.rm = TRUE)*100, 1)) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| percent_agreement |
|---|
| 91.8 |
data_after_exclusions %>%
select(deepfake_detection_rater_1,
deepfake_detection_rater_2) %>%
as.data.frame() %>% # kappa function won't take tibbles
psych::cohen.kappa(.)## Call: cohen.kappa1(x = x, w = w, n.obs = n.obs, alpha = alpha, levels = levels)
##
## Cohen Kappa and Weighted Kappa correlation coefficients and confidence boundaries
## lower estimate upper
## unweighted kappa 0.72 0.78 0.84
## weighted kappa 0.72 0.78 0.84
##
## Number of subjects = 576
Interpretation of Kappa (Altman 1999, Landis JR, 1977):
data_after_exclusions %>%
count(experiment_condition,
deepfake_detection_open_recoded) %>%
drop_na() %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| experiment_condition | deepfake_detection_open_recoded | n |
|---|---|---|
| genuine | FALSE | 188 |
| genuine | TRUE | 16 |
| deepfaked | FALSE | 273 |
| deepfaked | TRUE | 99 |
data_counts <- data_after_exclusions %>%
select(experiment_condition, deepfake_detection_open_recoded) %>%
drop_na() %>%
count(experiment_condition, deepfake_detection_open_recoded)
TP <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE), n)
FP <- pull(filter(data_counts, experiment_condition == "genuine" & deepfake_detection_open_recoded == TRUE), n)
FN <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
TN <- pull(filter(data_counts, experiment_condition == "genuine" & deepfake_detection_open_recoded == FALSE), n)
#accuracy <- (TP+TN)/(TP+TN+FP+FN)
sensitivity <- TP / (TP+FN)
false_negative_rate <- 1 - sensitivity
specificity <- TN / (TN+FP)
false_positive_rate <- 1 - specificity
# Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
informedness <- sensitivity + specificity - 1
balanced_accuracy <- (sensitivity + specificity)/2
fit_exploratory_classification <-
tibble(variable = c(
#"accuracy",
"balanced_accuracy",
"informedness",
#"sensitivity",
"false_negative_rate",
#"specificity",
"false_positive_rate"
),
observed = c(
#accuracy,
balanced_accuracy,
informedness,
#sensitivity,
false_negative_rate,
#specificity,
false_positive_rate
))if(file.exists("models/fit_exploratory_classification_bootstraps.rds")){
fit_exploratory_classification_bootstraps <- read_rds("models/fit_exploratory_classification_bootstraps.rds")
} else {
# create bootstraps using out of bag method. makes a df with values that are collapsed dfs.
boots <- data_after_exclusions %>%
select(experiment_condition, deepfake_detection_open_recoded) %>%
drop_na() %>%
bootstraps(times = 2000)
# generalize to a summarize function ------
bootstrap_categorization_stats <- function(split) {
data_counts <- analysis(split) %>%
count(experiment_condition, deepfake_detection_open_recoded)
TP <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE), n)
FP <- pull(filter(data_counts, experiment_condition == "genuine" & deepfake_detection_open_recoded == TRUE), n)
FN <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
TN <- pull(filter(data_counts, experiment_condition == "genuine" & deepfake_detection_open_recoded == FALSE), n)
#accuracy <- (TP+TN)/(TP+TN+FP+FN)
sensitivity <- TP / (TP+FN)
false_negative_rate <- 1 - sensitivity
specificity <- TN / (TN+FP)
false_positive_rate <- 1 - specificity
# Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
informedness <- sensitivity + specificity - 1
balanced_accuracy <- (sensitivity + specificity)/2
results <-
tibble(variable = c(
#"accuracy",
"balanced_accuracy",
"informedness",
#"sensitivity",
"false_negative_rate",
#"specificity",
"false_positive_rate"
),
value = c(
#accuracy,
balanced_accuracy,
informedness,
#sensitivity,
false_negative_rate,
#specificity,
false_positive_rate
))
return(results)
}
# apply to each bootstrap
fit_exploratory_classification_bootstraps <- boots %>%
mutate(categorization_stats = future_map(splits, bootstrap_categorization_stats)) %>%
select(-splits) %>%
unnest(categorization_stats)
write_rds(fit_exploratory_classification_bootstraps, "models/fit_exploratory_classification_bootstraps.rds")
}classifications <- fit_exploratory_classification_bootstraps %>%
group_by(variable) %>%
summarize(ci_lower = quantile(value, 0.025),
ci_upper = quantile(value, 0.975),
.groups = "drop") %>%
full_join(fit_exploratory_classification, by = "variable") %>%
mutate_if(is.numeric, round, digits = 2) %>%
select(variable, observed, ci_lower, ci_upper)
classifications %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| variable | observed | ci_lower | ci_upper |
|---|---|---|---|
| balanced_accuracy | 0.59 | 0.56 | 0.62 |
| false_negative_rate | 0.73 | 0.69 | 0.78 |
| false_positive_rate | 0.08 | 0.04 | 0.12 |
| informedness | 0.19 | 0.13 | 0.25 |
H3: Participants are poor at making accurate and informed judgements about whether online video content is genuine or Deepfaked. Our predictions here are descriptive/continuous rather than involving cut-off based inference rules.
H3a
We expect participants to be poor at correctly detecting Deepfakes (i.e., demonstrate a high false negative rate, FNR ≳ .80).
H3b
We expect participants to incorrectly detect Deepfakes even when the video content was real (i.e., demonstrate a high false positive rate, FPR ≳ .05).
H3c
We expect participants to be poor at making accurate decisions about whether content is genuine or not (i.e., balanced accuracy not greatly above chance, ≲ .60).
H3d
We expect participants to make poorly informed decisions about whether content is genuine or not (i.e., informedness/Youden’s J ≲ .25).
data_after_exclusions %>%
filter(deepfake_awareness_open_recoded == TRUE) %>%
count(experiment_condition,
deepfake_detection_open_recoded) %>%
drop_na() %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| experiment_condition | deepfake_detection_open_recoded | n |
|---|---|---|
| genuine | FALSE | 96 |
| genuine | TRUE | 10 |
| deepfaked | FALSE | 82 |
| deepfaked | TRUE | 32 |
data_counts_subset <- data_after_exclusions %>%
select(experiment_condition, deepfake_detection_open_recoded) %>%
drop_na() %>%
count(experiment_condition, deepfake_detection_open_recoded)
TP <- pull(filter(data_counts_subset, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE), n)
FP <- pull(filter(data_counts_subset, experiment_condition == "genuine" & deepfake_detection_open_recoded == TRUE), n)
FN <- pull(filter(data_counts_subset, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
TN <- pull(filter(data_counts_subset, experiment_condition == "genuine" & deepfake_detection_open_recoded == FALSE), n)
#accuracy <- (TP+TN)/(TP+TN+FP+FN)
sensitivity <- TP / (TP+FN)
false_negative_rate <- 1 - sensitivity
specificity <- TN / (TN+FP)
false_positive_rate <- 1 - specificity
# Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
informedness <- sensitivity + specificity - 1
balanced_accuracy <- (sensitivity + specificity)/2
fit_exploratory_classification_subset <-
tibble(variable = c(
#"accuracy",
"balanced_accuracy",
"informedness",
#"sensitivity",
"false_negative_rate",
#"specificity",
"false_positive_rate"
),
observed = c(
#accuracy,
balanced_accuracy,
informedness,
#sensitivity,
false_negative_rate,
#specificity,
false_positive_rate
))if(file.exists("models/fit_exploratory_classification_bootstraps_subset.rds")){
fit_exploratory_classification_bootstraps_subset <- read_rds("models/fit_exploratory_classification_bootstraps_subset.rds")
} else {
# create bootstraps using out of bag method. makes a df with values that are collapsed dfs.
boots <- data_after_exclusions %>%
filter(deepfake_awareness_open_recoded == TRUE) %>%
select(experiment_condition, deepfake_detection_open_recoded) %>%
drop_na() %>%
bootstraps(times = 2000)
# generalize to a summarize function ------
bootstrap_categorization_stats <- function(split) {
data_counts <- analysis(split) %>%
count(experiment_condition, deepfake_detection_open_recoded)
TP <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE), n)
FP <- pull(filter(data_counts, experiment_condition == "genuine" & deepfake_detection_open_recoded == TRUE), n)
FN <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
TN <- pull(filter(data_counts, experiment_condition == "genuine" & deepfake_detection_open_recoded == FALSE), n)
#accuracy <- (TP+TN)/(TP+TN+FP+FN)
sensitivity <- TP / (TP+FN)
false_negative_rate <- 1 - sensitivity
specificity <- TN / (TN+FP)
false_positive_rate <- 1 - specificity
# Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
informedness <- sensitivity + specificity - 1
balanced_accuracy <- (sensitivity + specificity)/2
results <-
tibble(variable = c(
#"accuracy",
"balanced_accuracy",
"informedness",
#"sensitivity",
"false_negative_rate",
#"specificity",
"false_positive_rate"
),
value = c(
#accuracy,
balanced_accuracy,
informedness,
#sensitivity,
false_negative_rate,
#specificity,
false_positive_rate
))
return(results)
}
# apply to each bootstrap
fit_exploratory_classification_bootstraps_subset <- boots %>%
mutate(categorization_stats = future_map(splits, bootstrap_categorization_stats)) %>%
select(-splits) %>%
unnest(categorization_stats)
write_rds(fit_exploratory_classification_bootstraps_subset, "models/fit_exploratory_classification_bootstraps_subset.rds")
}classifications_subset <- fit_exploratory_classification_bootstraps_subset %>%
group_by(variable) %>%
summarize(ci_lower = quantile(value, 0.025),
ci_upper = quantile(value, 0.975),
.groups = "drop") %>%
full_join(fit_exploratory_classification_subset, by = "variable") %>%
mutate_if(is.numeric, round, digits = 2) %>%
select(variable, observed, ci_lower, ci_upper)
classifications_subset %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| variable | observed | ci_lower | ci_upper |
|---|---|---|---|
| balanced_accuracy | 0.59 | 0.54 | 0.64 |
| false_negative_rate | 0.73 | 0.64 | 0.80 |
| false_positive_rate | 0.08 | 0.05 | 0.15 |
| informedness | 0.19 | 0.08 | 0.28 |
I.e., using the full sample and reporting the sample percentage.
Description of sample:
percent_aware <- data_after_exclusions %>%
dplyr::select(deepfake_awareness_open_recoded) %>%
drop_na() %>%
count(deepfake_awareness_open_recoded) %>%
mutate(counts = n,
awareness = as.factor(deepfake_awareness_open_recoded),
percent_aware = round(counts/sum(counts)*100, 1)) %>%
filter(awareness == "TRUE") %>%
dplyr::select(percent_aware)
percent_aware %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| percent_aware |
|---|
| 53.5 |
Putting aside true negatives and false positive, does prior awareness of the concept of deepfaking at least make people better at detecting deepfakes
It would of course be possible include data from both experiment_conditions and add it to the model, however interpreting the two and three way interactions is less intuitive. Given this question is of secondary importance, I we therefore elected for the simpler analysis focusing on awareness and the FNR/TPR.
# convert data to counts
data_counts_awareness_detection <- data_after_exclusions %>%
filter(experiment_condition == "deepfaked") %>%
dplyr::select(experiment, deepfake_awareness_open_recoded, deepfake_detection_open_recoded) %>%
drop_na() %>%
count(experiment, deepfake_awareness_open_recoded, deepfake_detection_open_recoded) %>%
group_by(experiment) %>%
mutate(counts = n,
awareness = as.factor(deepfake_awareness_open_recoded),
detection = as.factor(deepfake_detection_open_recoded),
proportion = counts/sum(counts)) %>%
ungroup() %>%
dplyr::select(experiment, awareness, detection, counts, proportion)
# total counts is needed later to convert to proportions
total_counts_awareness_detection <- data_counts_awareness_detection %>%
group_by(experiment) %>%
summarize(total = sum(counts))
# fit poisson model
fit_exploratory_poisson_awareness_detection <-
brm(formula = counts ~ 1 + awareness * detection + (1 | experiment),
family = poisson(),
data = data_counts_awareness_detection,
file = "models/fit_exploratory_poisson_awareness_detection",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.998,
max_treedepth = 18), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: poisson
## Links: mu = log
## Formula: counts ~ 1 + awareness * detection + (1 | experiment)
## Data: data_counts_awareness_detection (Number of observations: 8)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 2)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.69 0.97 0.01 3.43 1.00 4885 9539
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 3.71 0.65 2.33 5.14 1.00 6693
## awarenessTRUE 0.01 0.16 -0.30 0.32 1.00 14248
## detectionTRUE -1.94 0.31 -2.59 -1.36 1.00 11346
## awarenessTRUE:detectionTRUE 0.99 0.38 0.26 1.74 1.00 10820
## Tail_ESS
## Intercept 5754
## awarenessTRUE 14911
## detectionTRUE 12932
## awarenessTRUE:detectionTRUE 12992
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_poisson_awareness_detection) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_awarenessTRUE | uninformative |
| b_detectionTRUE | uninformative |
| b_awarenessTRUE.detectionTRUE | uninformative |
# posterior draws for parameters (for results table)
draws_awareness_detection <- posterior_samples(fit_exploratory_poisson_awareness_detection) %>%
dplyr::select(awarenessTRUE = b_awarenessTRUE,
detectionTRUE = b_detectionTRUE,
interaction = `b_awarenessTRUE:detectionTRUE`)
estimates_awareness_detection <-
full_join(as_tibble(map_estimate(draws_awareness_detection)),
as_tibble(bayestestR::hdi(draws_awareness_detection, ci = .95)),
by = "Parameter") %>%
# exponentiate the log IRR values to IRR
mutate_if(is.numeric, exp) %>%
full_join(draws_awareness_detection %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(exp(value) > 1)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
dplyr::select(Parameter, incidence_rate_ratio_MAP = MAP_Estimate, CI_95_lower = CI_low, CI_95_upper = CI_high, p)
# convert from odds to probability
# mutate_if(is.numeric, function(x){x/(1+x)}) %>%
# table
estimates_awareness_detection %>%
mutate_at(vars("incidence_rate_ratio_MAP", "CI_95_lower", "CI_95_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | incidence_rate_ratio_MAP | CI_95_lower | CI_95_upper | p |
|---|---|---|---|---|
| awarenessTRUE | 1.01 | 0.75 | 1.38 | 0.4682143 |
| detectionTRUE | 0.15 | 0.08 | 0.26 | 0.0000000 |
| interaction | 2.58 | 1.27 | 5.59 | 0.0034643 |
# hypothesis testing
H4 <- ifelse((estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(CI_95_lower)) > 1,
"Accepted", "Rejected")
comparison_string_awareness_detection <-
paste0("Individuals who were aware of the concept of Deepfakes prior to participating in the study were ",
estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(incidence_rate_ratio_MAP) %>% round(1),
" times more likely to detect that they had been shown a deepfake than those who were not aware of the concept (Incidence Rate Ratio = ",
estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(incidence_rate_ratio_MAP) %>% round(2),
", 95% CI [",
estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(CI_95_lower) %>% round(2),
", ",
estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(CI_95_upper) %>% round(2),
"])")H4
Participants who reporting being aware of the concept of Deepfakes prior to participating in the experiment will be better at detectin Deepfakes when presented with one. Specifically, we will use participants in the Deepfake condition to calculate cell counts for the Deepfake concept check and Deepfake detection questions (i.e., what number of participants were observed for each combination of those variables). We will then use a Bayesian Poisson model (model 5) to estimate a 95% Confidence Interval on the Incidence Rate Ratio for the interaction effect between Deepfakes concept awareness and deepfake detection. A Confidence Interval whose lower bound is > 1 will be considered evidence in support of this hypothesis.
posterior_predictions_awareness_detection <-
tibble(experiment = c("5", "6"),
awareness = c("TRUE", "FALSE"),
detection = c("TRUE", "FALSE")) %>%
data_grid(experiment, awareness, detection) %>%
add_predicted_draws(model = fit_exploratory_poisson_awareness_detection, re_formula = NULL) %>%
rename(predicted_count = .prediction) %>%
left_join(total_counts_awareness_detection, by = "experiment") %>%
mutate(predicted_probabiity = predicted_count/total) %>%
ungroup() %>%
dplyr::select(experiment, awareness, detection, predicted_count, predicted_probabiity)
posterior_predictions_awareness_detection_aT_dT <- posterior_predictions_awareness_detection %>%
filter(awareness == "TRUE" & detection == "TRUE")
posterior_predictions_awareness_detection_aT_dF <- posterior_predictions_awareness_detection %>%
filter(awareness == "TRUE" & detection == "FALSE")
posterior_predictions_awareness_detection_aF_dT <- posterior_predictions_awareness_detection %>%
filter(awareness == "FALSE" & detection == "TRUE")
posterior_predictions_awareness_detection_aF_dF <- posterior_predictions_awareness_detection %>%
filter(awareness == "FALSE" & detection == "FALSE")
results_detection_probabilities <-
rbind(
bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aT_dT$predicted_probabiity)),
as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aT_dT$predicted_probabiity,
ci = .95))) %>%
mutate(awareness = "TRUE", detection = "TRUE"),
bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aT_dF$predicted_probabiity)),
as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aT_dF$predicted_probabiity,
ci = .95))) %>%
mutate(awareness = "TRUE", detection = "FALSE"),
bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aF_dT$predicted_probabiity)),
as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aF_dT$predicted_probabiity,
ci = .95))) %>%
mutate(awareness = "FALSE", detection = "TRUE"),
bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aF_dF$predicted_probabiity)),
as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aF_dF$predicted_probabiity,
ci = .95))) %>%
mutate(awareness = "FALSE", detection = "FALSE")
) %>%
dplyr::select(awareness, detection, detection_probability_MAP = value,
CI_95_lower = CI_low, CI_95_upper = CI_high) %>%
mutate_if(is.numeric, round, digits = 3)
results_detection_probabilities %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| awareness | detection | detection_probability_MAP | CI_95_lower | CI_95_upper |
|---|---|---|---|---|
| TRUE | TRUE | 0.144 | 0.067 | 0.250 |
| TRUE | FALSE | 0.379 | 0.260 | 0.567 |
| FALSE | TRUE | 0.058 | 0.010 | 0.115 |
| FALSE | FALSE | 0.390 | 0.231 | 0.538 |
Subset who received deepfaked videos and were aware of the concept prior to the experiment. Same Bayesian multilevel models as employed above, using only source_valence as IV, i.e., to detect whether learning effects are credibly non-zero in this subset.
data_aware_subset_n <- data_after_exclusions %>%
filter(experiment_condition == "deepfaked" & !is.na(deepfake_awareness_open_recoded)) %>%
count(deepfake_awareness_open_recoded) %>%
mutate(proportion = round(n/sum(n), 2)) %>%
arrange(desc(proportion))
data_aware_subset_n %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| deepfake_awareness_open_recoded | n | proportion |
|---|---|---|
| TRUE | 114 | 0.55 |
| FALSE | 93 | 0.45 |
fit_exploratory_selfreport_deepfaked_aware <-
brm(formula = mean_self_reported_evaluation ~ source_valence + (1 | experiment),
family = gaussian(),
data = data_aware_subset,
file = "models/fit_exploratory_selfreport_deepfaked_aware",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_self_reported_evaluation ~ source_valence + (1 | experiment)
## Data: data_aware_subset (Number of observations: 114)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 2)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.78 0.98 0.01 3.40 1.00 4823 9909
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -1.60 0.62 -3.00 -0.19 1.00 6824
## source_valencepositive 3.07 0.20 2.67 3.47 1.00 17063
## Tail_ESS
## Intercept 5694
## source_valencepositive 16352
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.07 0.07 0.94 1.23 1.00 16528 15469
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_selfreport_deepfaked_aware) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_selfreport_deepfaked_aware)
plot_model(fit_exploratory_selfreport_deepfaked_aware, type = "pred", terms = "source_valence")# results table
draws_sr_deepfaked_aware <-
select(spread_draws(fit_exploratory_selfreport_deepfaked_aware, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_aware = b_source_valencepositive)
estimates_sr_deepfaked_aware <-
map_estimate(draws_sr_deepfaked_aware) %>%
full_join(bayestestR::hdi(draws_sr_deepfaked_aware, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_sr_deepfaked_aware, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_sr_deepfaked_aware %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_sr, Parameter %in% c("effect_deepfaked")),
estimates_sr_deepfaked_aware) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 2.80 | 2.63 | 2.96 | 2.66 | 2.93 | 0 |
| effect_deepfaked_aware | 3.08 | 2.68 | 3.48 | 2.73 | 3.40 | 0 |
# hypothesis testing
H5a <- ifelse((estimates_sr_deepfaked_aware %>% filter(Parameter == "effect_deepfaked_aware") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H5a
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 3) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########
fit_exploratory_implicit_deepfaked_aware <-
brm(formula = IAT_D2 ~ source_valence + (1 | experiment),
family = gaussian(),
data = data_aware_subset,
file = "models/fit_exploratory_implicit_deepfaked_aware",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: IAT_D2 ~ source_valence + (1 | experiment)
## Data: data_aware_subset (Number of observations: 114)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 2)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.76 1.00 0.01 3.54 1.00 4880 8892
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -0.09 0.67 -1.60 1.42 1.00 6140
## source_valencepositive 1.39 0.19 1.02 1.75 1.00 16554
## Tail_ESS
## Intercept 6124
## source_valencepositive 15509
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.99 0.07 0.87 1.13 1.00 16744 15952
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_implicit_deepfaked_aware) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_implicit_deepfaked_aware)
plot_model(fit_exploratory_implicit_deepfaked_aware, type = "pred", terms = "source_valence")# results table
draws_imp_deepfaked_aware <-
select(spread_draws(fit_exploratory_implicit_deepfaked_aware, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_aware = b_source_valencepositive)
estimates_imp_deepfaked_aware <-
map_estimate(draws_imp_deepfaked_aware) %>%
full_join(bayestestR::hdi(draws_imp_deepfaked_aware, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_imp_deepfaked_aware, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_imp_deepfaked_aware %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_imp, Parameter %in% c("effect_deepfaked")),
estimates_imp_deepfaked_aware) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 1.41 | 1.23 | 1.55 | 1.26 | 1.53 | 0 |
| effect_deepfaked_aware | 1.38 | 1.02 | 1.75 | 1.08 | 1.69 | 0 |
# hypothesis testing
H5b <- ifelse((estimates_imp_deepfaked_aware %>% filter(Parameter == "effect_deepfaked_aware") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H5b
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########
fit_exploratory_intentions_deepfaked_aware <-
brm(formula = mean_intentions ~ source_valence, # no random effect for experiment as only exp 6 assessed intentions
family = gaussian(),
data = data_aware_subset,
file = "models/fit_exploratory_intentions_deepfaked_aware",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_intentions ~ source_valence
## Data: data_aware_subset (Number of observations: 53)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -3.31 0.22 -3.75 -2.87 1.00 21104
## source_valencepositive 3.09 0.29 2.52 3.66 1.00 20531
## Tail_ESS
## Intercept 16876
## source_valencepositive 16552
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.04 0.11 0.86 1.28 1.00 21315 16915
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_intentions_deepfaked_aware) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_intentions_deepfaked_aware)
plot_model(fit_exploratory_intentions_deepfaked_aware, type = "pred", terms = "source_valence")# results table
draws_intentions_deepfaked_aware <-
select(spread_draws(fit_exploratory_intentions_deepfaked_aware, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_aware = b_source_valencepositive)
estimates_intentions_deepfaked_aware <-
map_estimate(draws_intentions_deepfaked_aware) %>%
full_join(bayestestR::hdi(draws_intentions_deepfaked_aware, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_intentions_deepfaked_aware, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_intentions_deepfaked_aware %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_intentions, Parameter %in% c("effect_deepfaked")),
estimates_intentions_deepfaked_aware) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 1.37 | 0.99 | 1.76 | 1.05 | 1.69 | 0 |
| effect_deepfaked_aware | 3.11 | 2.53 | 3.67 | 2.62 | 3.58 | 0 |
# hypothesis testing
H5c <- ifelse((estimates_intentions_deepfaked_aware %>% filter(Parameter == "effect_deepfaked_aware") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H5c
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########
Subset who received deepfaked videos but also detected them. Same Bayesian multilevel models as employed above, using only source_valence as IV, i.e., to detect whether learning effects are credibly non-zero in this subset.
data_detectors_subset_n <- data_after_exclusions %>%
filter(experiment_condition == "deepfaked" & !is.na(deepfake_detection_open_recoded)) %>%
count(deepfake_detection_open_recoded) %>%
mutate(proportion = round(n/sum(n), 2)) %>%
arrange(desc(proportion))
data_detectors_subset_n %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| deepfake_detection_open_recoded | n | proportion |
|---|---|---|
| FALSE | 273 | 0.73 |
| TRUE | 99 | 0.27 |
fit_exploratory_selfreport_deepfaked_detected <-
brm(formula = mean_self_reported_evaluation ~ source_valence + (1 | experiment),
family = gaussian(),
data = data_detectors_subset,
file = "models/fit_exploratory_selfreport_deepfaked_detected",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_self_reported_evaluation ~ source_valence + (1 | experiment)
## Data: data_detectors_subset (Number of observations: 99)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 3)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.42 0.50 0.01 1.84 1.00 4521 4742
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -1.60 0.35 -2.33 -0.84 1.00 6354
## source_valencepositive 2.75 0.24 2.28 3.22 1.00 15987
## Tail_ESS
## Intercept 4888
## source_valencepositive 16649
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.17 0.09 1.02 1.36 1.00 18604 17389
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_selfreport_deepfaked_detected) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_selfreport_deepfaked_detected)
plot_model(fit_exploratory_selfreport_deepfaked_detected, type = "pred", terms = "source_valence")# results table
draws_sr_deepfaked_detected <-
select(spread_draws(fit_exploratory_selfreport_deepfaked_detected, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_detected = b_source_valencepositive)
estimates_sr_deepfaked_detected <-
map_estimate(draws_sr_deepfaked_detected) %>%
full_join(bayestestR::hdi(draws_sr_deepfaked_detected, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_sr_deepfaked_detected, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_sr_deepfaked_detected %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_sr, Parameter %in% c("effect_deepfaked")),
estimates_sr_deepfaked_detected) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 2.80 | 2.63 | 2.96 | 2.66 | 2.93 | 0 |
| effect_deepfaked_detected | 2.74 | 2.29 | 3.23 | 2.35 | 3.13 | 0 |
# hypothesis testing
H6a <- ifelse((estimates_sr_deepfaked_detected %>% filter(Parameter == "effect_deepfaked_detected") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H6a
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 3) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
fit_exploratory_implicit_deepfaked_detected <-
brm(formula = IAT_D2 ~ source_valence + (1 | experiment),
family = gaussian(),
data = data_detectors_subset,
file = "models/fit_exploratory_implicit_deepfaked_detected",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: IAT_D2 ~ source_valence + (1 | experiment)
## Data: data_detectors_subset (Number of observations: 99)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 3)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.05 0.86 0.25 3.31 1.00 6046 8652
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 0.06 0.66 -1.36 1.46 1.00 7078
## source_valencepositive 1.07 0.19 0.71 1.43 1.00 18476
## Tail_ESS
## Intercept 7543
## source_valencepositive 14850
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.91 0.07 0.79 1.06 1.00 17701 13337
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_implicit_deepfaked_detected) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_implicit_deepfaked_detected)
plot_model(fit_exploratory_implicit_deepfaked_detected, type = "pred", terms = "source_valence")# results table
draws_imp_deepfaked_detected <-
select(spread_draws(fit_exploratory_implicit_deepfaked_detected, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_detected = b_source_valencepositive)
estimates_imp_deepfaked_detected <-
map_estimate(draws_imp_deepfaked_detected) %>%
full_join(bayestestR::hdi(draws_imp_deepfaked_detected, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_imp_deepfaked_detected, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_imp_deepfaked_detected %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_imp, Parameter %in% c("effect_deepfaked")),
estimates_imp_deepfaked_detected) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 1.41 | 1.23 | 1.55 | 1.26 | 1.53 | 0 |
| effect_deepfaked_detected | 1.06 | 0.70 | 1.42 | 0.76 | 1.37 | 0 |
# hypothesis testing
H6b <- ifelse((estimates_imp_deepfaked_detected %>% filter(Parameter == "effect_deepfaked_detected") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H6b
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
fit_exploratory_intentions_deepfaked_detected <-
brm(formula = mean_intentions ~ source_valence, # no random effect for experiment as only exp 6 assessed intentions
family = gaussian(),
data = data_detectors_subset,
file = "models/fit_exploratory_intentions_deepfaked_detected",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_intentions ~ source_valence
## Data: data_detectors_subset (Number of observations: 24)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -3.27 0.27 -3.80 -2.73 1.00 19083
## source_valencepositive 2.71 0.42 1.89 3.54 1.00 18613
## Tail_ESS
## Intercept 14831
## source_valencepositive 15579
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.00 0.16 0.74 1.36 1.00 18144 13630
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_intentions_deepfaked_detected) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_intentions_deepfaked_detected)
plot_model(fit_exploratory_intentions_deepfaked_detected, type = "pred", terms = "source_valence")# results table
draws_intentions_deepfaked_detected <-
select(spread_draws(fit_exploratory_intentions_deepfaked_detected, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_detected = b_source_valencepositive)
estimates_intentions_deepfaked_detected <-
map_estimate(draws_intentions_deepfaked_detected) %>%
full_join(bayestestR::hdi(draws_intentions_deepfaked_detected, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_intentions_deepfaked_detected, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_intentions_deepfaked_detected %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_intentions, Parameter %in% c("effect_deepfaked")),
estimates_intentions_deepfaked_detected) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 1.37 | 0.99 | 1.76 | 1.05 | 1.69 | 0 |
| effect_deepfaked_detected | 2.77 | 1.88 | 3.52 | 2.03 | 3.38 | 0 |
# hypothesis testing
H6c <- ifelse((estimates_intentions_deepfaked_detected %>% filter(Parameter == "effect_deepfaked_detected") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H6c
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.
Subset who received deepfaked videos, were aware of the concept before the study, and also detected them. Same Bayesian multilevel models as employed above, using only source_valence as IV, i.e., to detect whether learning effects are credibly non-zero in this subset.
data_aware_detectors_subset_n <- data_after_exclusions %>%
filter(experiment_condition == "deepfaked" &
!is.na(deepfake_detection_open_recoded) &
!is.na(deepfake_awareness_open_recoded)) %>%
count(deepfake_detection_open_recoded, deepfake_awareness_open_recoded) %>%
mutate(proportion = round(n/sum(n), 2)) %>%
arrange(desc(proportion))
data_aware_detectors_subset_n %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| deepfake_detection_open_recoded | deepfake_awareness_open_recoded | n | proportion |
|---|---|---|---|
| FALSE | TRUE | 82 | 0.40 |
| FALSE | FALSE | 81 | 0.39 |
| TRUE | TRUE | 32 | 0.15 |
| TRUE | FALSE | 12 | 0.06 |
fit_exploratory_selfreport_deepfaked_aware_detected <-
brm(formula = mean_self_reported_evaluation ~ source_valence + (1 | experiment),
family = gaussian(),
data = data_aware_detectors_subset,
file = "models/fit_exploratory_selfreport_deepfaked_aware_detected",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_self_reported_evaluation ~ source_valence + (1 | experiment)
## Data: data_aware_detectors_subset (Number of observations: 32)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 2)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.92 1.04 0.02 3.76 1.00 6834 11383
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -1.67 0.79 -3.40 -0.03 1.00 9078
## source_valencepositive 3.26 0.48 2.30 4.22 1.00 16897
## Tail_ESS
## Intercept 8457
## source_valencepositive 15216
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.34 0.18 1.04 1.76 1.00 17130 15103
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_selfreport_deepfaked_aware_detected) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_selfreport_deepfaked_aware_detected)
plot_model(fit_exploratory_selfreport_deepfaked_aware_detected, type = "pred", terms = "source_valence")# results table
draws_sr_deepfaked_aware_detected <-
select(spread_draws(fit_exploratory_selfreport_deepfaked_aware_detected, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_aware_detected = b_source_valencepositive)
estimates_sr_deepfaked_aware_detected <-
map_estimate(draws_sr_deepfaked_aware_detected) %>%
full_join(bayestestR::hdi(draws_sr_deepfaked_aware_detected, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_sr_deepfaked_aware_detected, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_sr_deepfaked_aware_detected %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_sr, Parameter %in% c("effect_deepfaked")),
estimates_sr_deepfaked_aware_detected) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 2.80 | 2.63 | 2.96 | 2.66 | 2.93 | 0 |
| effect_deepfaked_aware_detected | 3.25 | 2.35 | 4.26 | 2.48 | 4.06 | 0 |
# hypothesis testing
H7a <- ifelse((estimates_sr_deepfaked_aware_detected %>% filter(Parameter == "effect_deepfaked_aware_detected") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H7a
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 3) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########
fit_exploratory_implicit_deepfaked_aware_detected <-
brm(formula = IAT_D2 ~ source_valence + (1 | experiment),
family = gaussian(),
data = data_aware_detectors_subset,
file = "models/fit_exploratory_implicit_deepfaked_aware_detected",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: IAT_D2 ~ source_valence + (1 | experiment)
## Data: data_aware_detectors_subset (Number of observations: 32)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Group-Level Effects:
## ~experiment (Number of levels: 2)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.37 1.28 0.12 4.66 1.00 6234 7638
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -0.24 0.98 -2.39 1.74 1.00 7766
## source_valencepositive 1.23 0.34 0.56 1.90 1.00 15998
## Tail_ESS
## Intercept 7573
## source_valencepositive 12518
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.94 0.13 0.73 1.24 1.00 13717 14561
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_implicit_deepfaked_aware_detected) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_implicit_deepfaked_aware_detected)
plot_model(fit_exploratory_implicit_deepfaked_aware_detected, type = "pred", terms = "source_valence")# results table
draws_imp_deepfaked_aware_detected <-
select(spread_draws(fit_exploratory_implicit_deepfaked_aware_detected, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_aware_detected = b_source_valencepositive)
estimates_imp_deepfaked_aware_detected <-
map_estimate(draws_imp_deepfaked_aware_detected) %>%
full_join(bayestestR::hdi(draws_imp_deepfaked_aware_detected, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_imp_deepfaked_aware_detected, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_imp_deepfaked_aware_detected %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_imp, Parameter %in% c("effect_deepfaked")),
estimates_imp_deepfaked_aware_detected) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 1.41 | 1.23 | 1.55 | 1.26 | 1.53 | 0.0000000 |
| effect_deepfaked_aware_detected | 1.22 | 0.54 | 1.88 | 0.68 | 1.78 | 0.0003214 |
# hypothesis testing
H7b <- ifelse((estimates_imp_deepfaked_aware_detected %>% filter(Parameter == "effect_deepfaked_aware_detected") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H7b
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########
fit_exploratory_intentions_deepfaked_aware_detected <-
brm(formula = mean_intentions ~ source_valence, # no random effect for experiment as only exp 6 assessed intentions
family = gaussian(),
data = data_aware_detectors_subset,
file = "models/fit_exploratory_intentions_deepfaked_aware_detected",
prior = prior(normal(0, 10)),
iter = 10000,
warmup = 3000,
control = list(adapt_delta = 0.99), # to avoid divergent transitions
chains = 4,
cores = parallel::detectCores())## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: mean_intentions ~ source_valence
## Data: data_aware_detectors_subset (Number of observations: 17)
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
## total post-warmup samples = 28000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -3.15 0.36 -3.86 -2.43 1.00 17518
## source_valencepositive 2.46 0.53 1.39 3.52 1.00 17793
## Tail_ESS
## Intercept 14708
## source_valencepositive 14680
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.08 0.22 0.75 1.60 1.00 15668 15286
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.
check_prior(fit_exploratory_intentions_deepfaked_aware_detected) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | Prior_Quality |
|---|---|
| b_Intercept | uninformative |
| b_source_valencepositive | uninformative |
#plot_model(fit_exploratory_intentions_deepfaked_aware_detected)
plot_model(fit_exploratory_intentions_deepfaked_aware_detected, type = "pred", terms = "source_valence")# results table
draws_intentions_deepfaked_aware_detected <-
select(spread_draws(fit_exploratory_intentions_deepfaked_aware_detected, b_source_valencepositive), b_source_valencepositive) %>%
rename(effect_deepfaked_aware_detected = b_source_valencepositive)
estimates_intentions_deepfaked_aware_detected <-
map_estimate(draws_intentions_deepfaked_aware_detected) %>%
full_join(bayestestR::hdi(draws_intentions_deepfaked_aware_detected, ci = .95) %>%
rename(CI_95_lower = CI_low,
CI_95_upper = CI_high) %>%
as_tibble(),
by = "Parameter") %>%
full_join(bayestestR::hdi(draws_intentions_deepfaked_aware_detected, ci = .90) %>%
as_tibble() %>%
rename(CI_90_lower = CI_low,
CI_90_upper = CI_high),
by = "Parameter") %>%
full_join(draws_intentions_deepfaked_aware_detected %>%
gather(Parameter, value) %>%
group_by(Parameter) %>%
summarize(pd = mean(value > 0)) %>%
mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
ungroup() %>%
select(Parameter, p),
by = "Parameter") %>%
select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
CI_90_lower, CI_90_upper, p)
bind_rows(filter(estimates_intentions, Parameter %in% c("effect_deepfaked")),
estimates_intentions_deepfaked_aware_detected) %>%
mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| Parameter | MAP_Estimate | CI_95_lower | CI_95_upper | CI_90_lower | CI_90_upper | p |
|---|---|---|---|---|---|---|
| effect_deepfaked | 1.37 | 0.99 | 1.76 | 1.05 | 1.69 | 0 |
| effect_deepfaked_aware_detected | 2.44 | 1.39 | 3.52 | 1.58 | 3.32 | 0 |
# hypothesis testing
H7c <- ifelse((estimates_intentions_deepfaked_aware_detected %>% filter(Parameter == "effect_deepfaked_aware_detected") %>%
pull(CI_95_lower)) > 0,
"Accepted", "Rejected")H7c
In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########
H1: Establishing first impressions via online video content
H2: Are deepfakes just as good as the real thing?
H3: How good are people at detecting whether content is genuine or Deepfaked?
Whole sample
classifications %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| variable | observed | ci_lower | ci_upper |
|---|---|---|---|
| balanced_accuracy | 0.59 | 0.56 | 0.62 |
| false_negative_rate | 0.73 | 0.69 | 0.78 |
| false_positive_rate | 0.08 | 0.04 | 0.12 |
| informedness | 0.19 | 0.13 | 0.25 |
Those who were aware of the concept prior to the study
classifications_subset %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)| variable | observed | ci_lower | ci_upper |
|---|---|---|---|
| balanced_accuracy | 0.59 | 0.54 | 0.64 |
| false_negative_rate | 0.73 | 0.64 | 0.80 |
| false_positive_rate | 0.08 | 0.05 | 0.15 |
| informedness | 0.19 | 0.08 | 0.28 |
H4: Does prior awareness of the concept of Deepfakes make people better at detecting them?
H5-7: Does being aware of the concept, detecitng the deepfake, or both make you immune to a Deepfake?
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_IE.UTF-8/en_IE.UTF-8/en_IE.UTF-8/C/en_IE.UTF-8/en_IE.UTF-8
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] furrr_0.2.1 future_1.19.1 modelr_0.1.8 semTools_0.5-3
## [5] lavaan_0.6-7 IATscores_0.2.7 broom_0.7.2 rsample_0.0.7
## [9] psych_2.0.9 sjPlot_2.8.4 bayestestR_0.7.5 tidybayes_2.0.3
## [13] brms_2.14.0 Rcpp_1.0.5 kableExtra_1.3.1 knitr_1.30
## [17] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.2 purrr_0.3.4
## [21] readr_1.3.1 tidyr_1.1.2 tibble_3.0.4 ggplot2_3.3.2
## [25] tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.1.0 lme4_1.1-25 htmlwidgets_1.5.1
## [4] grid_4.0.2 munsell_0.5.0 codetools_0.2-16
## [7] effectsize_0.4.0 statmod_1.4.34 DT_0.13
## [10] miniUI_0.1.1.1 withr_2.3.0 Brobdingnag_1.2-6
## [13] colorspace_2.0-0 highr_0.8 rstudioapi_0.13
## [16] stats4_4.0.2 bayesplot_1.7.2 listenv_0.8.0
## [19] labeling_0.4.2 huge_1.3.4.1 emmeans_1.4.6
## [22] rstan_2.21.2 mnormt_1.5-7 farver_2.0.3
## [25] bridgesampling_1.0-0 coda_0.19-3 vctrs_0.3.5
## [28] generics_0.0.2 TH.data_1.0-10 xfun_0.19
## [31] R6_2.5.0 markdown_1.1 assertthat_0.2.1
## [34] promises_1.1.0 scales_1.1.1 multcomp_1.4-13
## [37] nnet_7.3-14 gtable_0.3.0 globals_0.13.1
## [40] processx_3.4.4 sandwich_2.5-1 rlang_0.4.8
## [43] splines_4.0.2 checkmate_2.0.0 inline_0.3.16
## [46] yaml_2.2.1 reshape2_1.4.4 abind_1.4-5
## [49] d3Network_0.5.2.1 threejs_0.3.3 crosstalk_1.1.0.1
## [52] backports_1.1.9 httpuv_1.5.2 rsconnect_0.8.16
## [55] Hmisc_4.4-1 tools_4.0.2 ellipsis_0.3.1
## [58] RColorBrewer_1.1-2 ggridges_0.5.2 plyr_1.8.6
## [61] base64enc_0.1-3 ps_1.4.0 prettyunits_1.1.1
## [64] rpart_4.1-15 pbapply_1.4-2 zoo_1.8-8
## [67] qgraph_1.6.5 haven_2.3.1 cluster_2.1.0
## [70] fs_1.4.1 magrittr_2.0.1 data.table_1.13.2
## [73] colourpicker_1.0 reprex_0.3.0 mvtnorm_1.1-1
## [76] whisker_0.4 sjmisc_2.8.5 matrixStats_0.56.0
## [79] hms_0.5.3 shinyjs_1.1 mime_0.9
## [82] evaluate_0.14 arrayhelpers_1.1-0 xtable_1.8-4
## [85] shinystan_2.5.0 sjstats_0.18.0 jpeg_0.1-8.1
## [88] readxl_1.3.1 gridExtra_2.3 ggeffects_0.14.3
## [91] rstantools_2.1.1 compiler_4.0.2 V8_3.2.0
## [94] crayon_1.3.4 minqa_1.2.4 StanHeaders_2.21.0-6
## [97] htmltools_0.5.0 corpcor_1.6.9 later_1.0.0
## [100] Formula_1.2-3 RcppParallel_5.0.2 lubridate_1.7.9
## [103] DBI_1.1.0 sjlabelled_1.1.7 dbplyr_1.4.3
## [106] MASS_7.3-53 boot_1.3-25 Matrix_1.2-18
## [109] cli_2.1.0 insight_0.10.0 igraph_1.2.5
## [112] BDgraph_2.62 pkgconfig_2.0.3 foreign_0.8-80
## [115] xml2_1.3.2 svUnit_1.0.3 pbivnorm_0.6.0
## [118] dygraphs_1.1.1.6 webshot_0.5.2 estimability_1.3
## [121] rvest_0.3.5 snakecase_0.11.0 callr_3.5.1
## [124] digest_0.6.27 parameters_0.8.6 rmarkdown_2.5
## [127] cellranger_1.1.0 htmlTable_1.13.3 curl_4.3
## [130] shiny_1.5.0 gtools_3.8.2 rjson_0.2.20
## [133] nloptr_1.2.2.2 glasso_1.11 lifecycle_0.2.0
## [136] nlme_3.1-148 jsonlite_1.7.1 viridisLite_0.3.0
## [139] fansi_0.4.1 pillar_1.4.6 lattice_0.20-41
## [142] loo_2.3.1 fastmap_1.0.1 httr_1.4.1
## [145] pkgbuild_1.1.0 survival_3.1-12 glue_1.4.2
## [148] xts_0.12-0 fdrtool_1.2.15 png_0.1-7
## [151] shinythemes_1.1.2 stringi_1.4.6 performance_0.4.6
## [154] latticeExtra_0.6-29